An automated method for the assessment of the rice grain germination rate

The germination rate of rice grain is recognized as one of the most significant indicators in seed quality assessment. Currently, the grain germination rate is generally determined manually by experienced researchers, which is time-consuming and labor-intensive. In this paper, a new method is proposed for counting the total number of grains and the number of germinated grains. In the coarse segmentation process, the k-means clustering algorithm is applied to obtain rough grain connected regions. We then refine the segmentation results obtained by the k-means algorithm using a one-dimensional Gaussian filter and a fifth-degree polynomial. Next, the optimal single grain area is determined based on the area distribution curve. Accordingly, the number of grains contained in a connected region is equal to the area of the connected region divided by the optimal single grain area. Finally, a novel algorithm is proposed for counting germinated grains. This algorithm is based on the idea that the length of the intersection between the germ and the grain is less than the circumference of the germ. The experimental results show that the mean absolute error of the proposed method for germination rate is 2.7%. The performance of the proposed method is robust to changes in grain number, grain varieties, scale, illumination, and rotation.


Introduction
Rice is one of the three most important food crops, and its quality mainly depends on the quality of the seeds. In agricultural production, the germination rate of grains is often used as one of the important indicators for judging seed quality. Therefore, an accurate assessment of grain germination rate will help to accurately assess seed quality.
In multi-batch measurement, the seed germination rate is generally determined manually by experienced researchers, who count the total number of grains and the number of germinated grains separately. This work is time-consuming and laborious, and the standard for deciding whether a grain has germinated is susceptible to human factors, which makes it difficult for different personnel to achieve consistent results and makes the experiment unrepeatable.
In recent years, the development of image processing and pattern recognition technology has provided conditions for the automatic quantitative assessment of seed germination rate.

PLOS ONE | https://doi.org/10.1371/journal.pone.0279934 | January 3, 2023
The grain germination rate assessment based on image processing can be divided into the following three steps: (1) grain segmentation; (2) grain counting; (3) germinated grain counting. Grain segmentation treats the grains as the foreground and separates them from the background. In this way, a binary image can be used to represent the grain image, which simplifies subsequent grain counting. One of the difficulties of grain segmentation is that the color of the grain is easily affected by illumination changes. When the illumination is strong, the color of the grain tends to be white. Conversely, when the illumination is weak, the color of the grain tends to be gray. To reduce the influence of illumination, Wu et al. [1] transformed the image from RGB color space to Lab color space, used the two color difference components a and b to represent pixels, and then applied the k-means clustering algorithm for grain segmentation. The segmentation result of this method depends on the initial cluster centers, and randomly selected initial cluster centers may make the final segmentation result wrong. In addition, the active contour model is often used for grain segmentation [2].
Most grains appear as a cluster of multiple touching grains in the grain image. If we can remove the adhesions between the grains while separating the grains from the image, the number of grains can be expressed in terms of the number of connected regions in the binary image. The watershed-based segmentation method [3] can perform grain segmentation, but it is prone to over-segmentation when there is noise and rough edges in the image. In contrast, serious adhesions between grains can cause the under-segmentation problem. Some researchers propose to combine the watershed algorithm with other methods to solve the above problems. Duarte et al. [4] used an undirected graph to represent the segmentation results after watershed transformation and used a hierarchical social metaheuristic to improve image segmentation quality. Wang et al. [5] constructed the internal markers through a series of morphological operations to solve the over-segmentation problem in watershed transformation. Yang et al. [6] incorporated gray-scale gradient information into the watershed transformation to segment the overlapping objects.
The touching grains generally form concave points at the adhesive part between the grains, and the segmentation line formed by the concave point pair can separate the grains [7]. Generally, the concave point can be detected based on the curvature of the grain contour and the corner points. However, rough edges will seriously affect the curvature of the contour, which will cause errors in concave point detection. To accurately detect the concave points, Mebatsion et al. [8] used an elliptic Fourier series approximation to smooth the boundary contours of the image. In addition, Lin et al. [9] smoothed the contour curve by convolving the curve with a Gaussian kernel function. Liu et al. [10] used a circular template with a fixed radius to move along the contour and detect the concave points based on the response value. However, a circular template with a fixed radius is likely to result in a missed or wrong detection. In addition, another difficulty in the separation process is concave point matching. The currently proposed matching rules for concave point pairs include the nearest-neighbor criterion [11] and the radian critical distance criterion [12].
The earliest method used for seed germination detection takes the original grain image as a template and compares the germinated grain with the ungerminated grain [13]. In this method, the camera position is required to be fixed and the seeds need to be neatly arranged at equal intervals without adhesions. Because of these restrictions, the image acquisition system is difficult to popularize. Mladenov et al. [14] proposed to assess seed germination with a neural network, but the seeds are also not allowed to touch each other. In the method proposed by Khoenkaw [15], the seeds with roots are regarded as germinated seeds. In this method, the distance between the camera and the seed needs to be fixed, and whether there is a germ is determined by a preset threshold, which also brings inconvenience to practical application. Recently, convolutional neural networks have been proposed and applied to classify objects, such as SSD [16], YOLO [17], R-CNN [18], and Fast R-CNN [18]. Genze et al. [19] used Fast R-CNN to distinguish whether the grains have germinated. To facilitate labeling, these methods require that the grains in the training images or test images are separated from each other, which also brings inconvenience to actual use.
The most commonly used methods, including the watershed algorithm, concave point detection, and the active contour model, try to separate the adhered grains to obtain an accurate grain number. However, grains touch each other in complex patterns, and the intersections between adhering grains form edges of varying strength. As a result, these methods are not applicable in all complex adhesion situations. To solve this problem, the basic idea of the proposed algorithm is as follows. Assume that we separate all the grains from the image and use a binary image to represent them, so that both isolated grains and adhering grains appear as connected regions with different areas. If the area of a single grain is denoted as $s_0$, the number of grains contained in a connected region is equal to the area of the connected region divided by $s_0$. We can then obtain the total number of grains by adding the grain numbers of all connected regions. The advantage of this approach is that the operation of removing the adhesion between the grains can be avoided.
In addition, most of the aforementioned methods of grain germination detection require a fixed camera position and require the grains to keep a certain distance from each other without any adhesion. Such requirements are difficult to meet in actual use. The basic idea of grain germination detection in this paper is as follows: We can separate the grain germ (or radicle) part according to the color difference between the yellow grain and the white germ (or radicle). Then, the germs (or radicles) are further selected according to the following characteristics: (1) There is a certain proportional relationship between the area of the germ (or radicle) and the area of a single grain $s_0$; (2) The germ (or radicle) must be attached to the grain; (3) The length of the intersection between the germ (or radicle) and the grain is less than the circumference of the germ (or radicle).
The rice grain germination rate assessment system proposed in this paper has low requirements in the experimental environment. It can be used for shooting at any time with a mobile phone. The server completes the assessment of the germination rate and returns the results to the mobile terminal. The whole process takes about 1 to 2 seconds.

Test materials
First, prepare a piece of less reflective black paper as the background. The black background has a large difference in color in comparison to the yellow grains and white germs (or radicles), which is convenient for subsequent recognition. Next, prepare a less reflective black container with some water and grains. If the container is highly reflective, it will interfere with subsequent grain recognition. Lastly, the containers are placed in a thermostat for dark cultivation at 25˚C and photographed after 24 to 48 hours.

Software development tools
We developed the web application in Python. Users can access the URL (http://117.68.114.143:82/) with a browser on a desktop computer, laptop computer, or mobile phone, select the image to be recognized, and upload it to the server. The server runs the recognition algorithm and sends the results, such as the germination rate, back to the client. The results of numerous experiments demonstrate that the algorithm proposed in this paper is robust to variation in scale, rotation, and illumination. The system diagram is shown in Fig 1.

Image capture
When capturing an image, the user can place the phone on top of the container and keep the phone as horizontal as possible. Only the black background and the container are included in the captured image. The distance between the phone and the container can be adjusted up and down as needed.

Coarse segmentation
In coarse segmentation, we need to segment the grains from the image and convert the RGB color image into a binary grain image. As shown in Fig [...], when R(x,y)−B(x,y) and G(x,y)−B(x,y) are larger, the corresponding pixel color is yellow; otherwise, it is black or white. The distributions of R(x,y)−B(x,y) and G(x,y)−B(x,y) of the grains differ considerably from one another because of differences in illumination and grain soaking time. Therefore, the grains cannot be correctly segmented by a fixed threshold. In this paper, the k-means clustering algorithm [20] is used for coarse grain segmentation. First, the attributes of each pixel are represented by $r_b$ and $g_b$.
The k-means algorithm classifies pixels according to the distance between each pixel and the cluster centers. Since the R−B span is large and the color of the pixels on the border of a grain is close to white or black, the border pixels of the grain may be wrongly classified as background when K = 2. To avoid this error, we reduce the R−B span by Eq (1): the $r_b$ value is uniformly set to 30 when the R−B value is greater than or equal to 60; otherwise, it remains unchanged. The same is true for the $g_b$ value. Algorithm 1 lists the clustering process of the k-means algorithm. The two initial cluster centers are fixed at (0,0) and (40,40), respectively, so that the clustering result does not depend on a random selection of the initial cluster centers.
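The clustering step described above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the function names (`clip_span`, `pixel_features`, `kmeans_two_clusters`) are hypothetical, pixels are passed as a flat list of (R, G, B) tuples, and the span-reduction rule simply follows the wording of the text (a difference of 60 or more is set to 30).

```python
def clip_span(diff, threshold=60, capped_value=30):
    """Span reduction as worded in the text: a difference >= threshold is set to capped_value."""
    return capped_value if diff >= threshold else diff


def pixel_features(rgb_pixels):
    """Map (R, G, B) tuples to the two attributes (r_b, g_b) used for clustering."""
    return [(clip_span(r - b), clip_span(g - b)) for r, g, b in rgb_pixels]


def kmeans_two_clusters(features, centers=((0.0, 0.0), (40.0, 40.0)), max_iter=100):
    """K-means with fixed initial centers; returns (labels, final_centers)."""
    centers = [tuple(float(v) for v in c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each pixel goes to the nearest center (squared Euclidean distance).
        labels = [
            min(range(len(centers)),
                key=lambda k: (f[0] - centers[k][0]) ** 2 + (f[1] - centers[k][1]) ** 2)
            for f in features
        ]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [f for f, lab in zip(features, labels) if lab == k]
            if members:
                new_centers.append((sum(m[0] for m in members) / len(members),
                                    sum(m[1] for m in members) / len(members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center unchanged
        if new_centers == centers:  # stop when the centroids no longer move
            break
        centers = new_centers
    return labels, centers


# Two background pixels (black, white) and two yellowish grain pixels.
pixels = [(10, 10, 10), (250, 250, 250), (200, 170, 60), (120, 100, 80)]
labels, centers = kmeans_two_clusters(pixel_features(pixels))
print(labels)  # label 1 marks the cluster that started at (40, 40), i.e. the grain cluster
```

Because the initial centers are fixed, repeated runs on the same image give the same labels, which is the property the text relies on.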
[...]
      assign pixel $p_i$ to cluster $C_k$
   endfor
   Compute the new centroid (mean) of each cluster
until the centroid positions do not change

[...] Fig 2A and 2B, that is, the white part (grain connected regions) in the binary image is represented by the corresponding pixel color in the RGB image. To further observe the coarse segmentation results, we magnify the grain in the purple rectangle, as shown in Fig 3. We can see from Fig 3 that the grain boundaries are relatively rough after coarse segmentation, that is, the separated foreground grains include the core part of the grain (bright area) and the border part of the grain (dark area). This is mainly because there are shadows around the grain boundaries and the color of the shadow is closer to that of the yellow grain. Hence the shadows are classified as part of the grains in k-means clustering. Therefore, the clustering results of k-means need to be refined.

Fig 4A shows [...] The border part of the grain corresponds to peak point A, whose red component value is 85, and the core part of the grain corresponds to peak point B, whose red component value is 234. There is a valley point C between the two peak points A and B. The red component value of point C can be regarded as the threshold d separating the core part and the border part of the grain. In the refinement segmentation stage, the pixels in the grain connected region whose red component value is less than d are reclassified as background.

Refinement segmentation
Because the histogram shown in Fig 4A is not smooth, it is difficult to locate the valley point directly from the original histogram. We use a one-dimensional Gaussian filter [21] to filter the histogram. Fig 4B shows the histogram filtered by a Gaussian filter. The Gaussian filter is defined as:

$$G(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{x^2}{2\sigma^2}\right)$$

where σ is the standard deviation. If a smaller filter size is used, the fluctuation of the histogram curve cannot be reduced in most cases. As a consequence, too many valley points will be detected, which will lead to wrong segmentation. Hence, we use a Gaussian filter of size 100 with a standard deviation of 7.5 in this paper.
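The histogram smoothing can be sketched as below, using the filter size (100) and standard deviation (7.5) given in the text. The function names are hypothetical, and two details are assumptions of this sketch rather than statements from the paper: the kernel is normalized so that its weights sum to 1, and the histogram is extended at both ends by repeating its first and last values.

```python
import math


def gaussian_kernel(size, sigma):
    """Sampled one-dimensional Gaussian, normalized so the weights sum to 1."""
    center = (size - 1) / 2.0
    weights = [math.exp(-((i - center) ** 2) / (2.0 * sigma ** 2)) for i in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_histogram(hist, size=100, sigma=7.5):
    """Convolve a histogram with the Gaussian kernel (edges replicated: an assumption)."""
    kernel = gaussian_kernel(size, sigma)
    n = len(hist)
    half = size // 2  # with an even size the window is off-center by half a bin
    smoothed = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - half, 0), n - 1)  # clamp to the valid bin range
            acc += w * hist[idx]
        smoothed.append(acc)
    return smoothed


# A rapidly fluctuating histogram is flattened to its local mean away from the edges.
noisy = [0.0, 10.0] * 128
print(round(smooth_histogram(noisy)[128], 3))
```

A shorter kernel leaves more of the bin-to-bin fluctuation in place, which is why the text reports spurious valley points for small filter sizes.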
After applying a Gaussian filter to the histogram, we found from experiments that some histograms still contain multiple valley points. As shown in Fig 5A, there are three valley points $C_1$, $C_2$, and $C_3$ between the peak points A and B, which makes it difficult to determine the threshold. To further reduce the number of peak points and valley points, we perform curve fitting [22] on the smoothed histogram. Since we need to keep the true peak points and valley points, we select a 5th-degree polynomial to fit the histogram:

$$y = a x^5 + b x^4 + c x^3 + d x^2 + e x + f$$

where a, b, c, d, e, and f are parameters whose values can be obtained by the least-squares method. Fig 5B shows [...]
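The following sketch illustrates fitting a 5th-degree polynomial by least squares and reading the valley between the two peaks off the fitted curve. It is an illustration under stated assumptions, not the authors' code: the helper names are hypothetical, the bin index is rescaled to [-1, 1] before fitting to keep the normal equations well conditioned, and the valley is taken as the lowest fitted value between the two highest local maxima.

```python
def polyfit(xs, ys, degree=5):
    """Least-squares polynomial fit via the normal equations; highest power first."""
    n = degree + 1
    power_sums = [sum(x ** k for x in xs) for k in range(2 * degree + 1)]
    # Augmented matrix [A | b] with A[r][c] = sum(x^(r+c)) and b[r] = sum(y * x^r).
    a = [[power_sums[r + c] for c in range(n)]
         + [sum(y * x ** r for x, y in zip(xs, ys))] for r in range(n)]
    for col in range(n):  # Gaussian elimination with partial pivoting
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        rest = sum(a[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (a[r][n] - rest) / a[r][r]
    return coeffs[::-1]


def polyval(coeffs, x):
    """Evaluate a polynomial given highest-power-first coefficients (Horner's rule)."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


def fitted_valley(hist, degree=5):
    """Index of the lowest fitted value between the two highest fitted peaks, or None."""
    n = len(hist)
    ts = [2.0 * i / (n - 1) - 1.0 for i in range(n)]  # rescale bins to [-1, 1]
    coeffs = polyfit(ts, hist, degree)
    fitted = [polyval(coeffs, t) for t in ts]
    peaks = [i for i in range(1, n - 1) if fitted[i - 1] < fitted[i] > fitted[i + 1]]
    if len(peaks) < 2:
        return None
    left, right = sorted(sorted(peaks, key=lambda i: fitted[i], reverse=True)[:2])
    return min(range(left + 1, right), key=lambda i: fitted[i])


# Synthetic two-peak curve with peaks at bins 50 and 150 and a valley at bin 100.
curve = [1.0 - ((2.0 * i / 200 - 1.0) ** 2 - 0.25) ** 2 for i in range(201)]
print(fitted_valley(curve))
```

A 5th-degree polynomial has at most four turning points, so the fitted curve can contain at most two peaks, which is what makes the valley between them unambiguous.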

Area of single grain determination
After the refined segmentation, we obtain grain connected regions of different sizes. The number of pixels contained in a connected region is regarded as the area of the region, denoted as s. In addition, the area of a single grain is denoted as $s_0$. When the grains have a uniform size distribution and there is no occlusion between the grains, the number of grains in the connected region is equal to $s/s_0$. Next, we need to determine $s_0$.
Assume that a binary grain image $I_B$ contains m grain connected regions. We sort the areas of the m connected regions in ascending order. Fig 6A shows the area distribution curve after sorting. In Fig 6A, the areas of the connected regions located to the left of point A are close to 0. These regions correspond to noise in the image and can be ignored when counting the number of grains. Points A and B in Fig 6A can be regarded as the turning points of the area distribution curve, and the curve between points A and B remains stable. Because some of the connected regions in the binary grain image must consist of a single grain, each connected region between A and B corresponds to a single grain. The areas $s_A$ and $s_B$ at points A and B represent the minimum and maximum areas of all individual grains, respectively.
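The area distribution curve is simply the sorted list of connected-region pixel counts. A minimal sketch of how it could be computed from a binary image is given below; the function name is hypothetical, the image is a list of rows of 0/1 values, and the use of 8-connectivity is an assumption, since the text does not state which connectivity is used.

```python
from collections import deque


def sorted_region_areas(binary):
    """Pixel counts of the 8-connected foreground regions, in ascending order."""
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            areas.append(area)
    return sorted(areas)


image = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
]
print(sorted_region_areas(image))  # [1, 4, 6]
```

In this toy image the single-pixel region plays the role of the noise to the left of point A.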
To determine the position of point A, we first use a one-dimensional Gaussian filter [23] with a length of 3 and σ of 0.65 to smooth the area distribution curve shown in Fig 6A. The result is shown in Fig 6B. Then a one-dimensional Laplacian operator is applied to the smoothed curve, where the one-dimensional Laplacian operator can be expressed as:

$$g(x) = f(x+1) + f(x-1) - 2f(x)$$

where f() denotes the smoothed area distribution curve. After the Laplacian operator processing, we obtain a new curve, denoted by g(x), as shown in Fig 6C. Scanning this curve from its start, the first point g(i) that satisfies the following condition can be considered as point A:

$$g(i-1) < 0 \ \text{and} \ g(i-1) < g(i) \ \text{and} \ g(i+1) > 0 \ \text{and} \ g(i+1) > g(i) \qquad (6)$$

Suppose that the area of the ith point on the area distribution curve is $s_i$; correspondingly, the area of the (i+3)th point is $s_{i+3}$. Starting from point A, the point on the area distribution curve that satisfies the following condition can be regarded as point B: [...] If a point B that meets condition (7) cannot be found, each grain connected region is considered to contain only a single grain, so the last point on the area distribution curve is taken as point B.
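The search for point A can be sketched as follows. The helper names are hypothetical, and the sketch makes three assumptions: the Laplacian is the standard discrete second difference g(x) = f(x+1) + f(x−1) − 2f(x), its two end values are set to 0, and the curve is extended by edge replication during smoothing. Condition (6) is applied exactly as stated. The condition for point B is not reproduced here because it is not recoverable from the text.

```python
import math


def smooth3(values, sigma=0.65):
    """Length-3 Gaussian smoothing (normalized weights, edges replicated)."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in (-1, 0, 1)]
    total = sum(weights)
    weights = [w / total for w in weights]
    n = len(values)
    return [
        sum(weights[j] * values[min(max(i + j - 1, 0), n - 1)] for j in range(3))
        for i in range(n)
    ]


def laplacian(f):
    """Discrete second difference; the two end points are set to 0."""
    n = len(f)
    return [0.0] + [f[i + 1] + f[i - 1] - 2 * f[i] for i in range(1, n - 1)] + [0.0]


def find_point_a(sorted_areas):
    """Index of the first point satisfying condition (6), or None if there is none."""
    g = laplacian(smooth3(sorted_areas))
    for i in range(1, len(g) - 1):
        if g[i - 1] < 0 and g[i - 1] < g[i] and g[i + 1] > 0 and g[i + 1] > g[i]:
            return i
    return None


# Noise regions, a plateau of single grains around 500-525 pixels, then clusters.
areas = [1, 1, 1, 1, 500, 510, 515, 520, 525, 1000, 1500]
index_a = find_point_a(areas)
print(index_a, areas[index_a])
```

On this toy curve the detected index falls inside the single-grain plateau. On very short or noisy curves the "first point" rule can fire inside the noise region, so real data with many regions behaves better than small examples.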
If $s_A$ is taken as the area of a single grain, the obtained grain number will be larger than the true value since most grain areas are larger than $s_A$. In contrast, if $s_B$ is taken as the area of a single grain, the obtained grain number will be smaller than the true value. Taking the average value $(s_A + s_B)/2$ as the area of a single grain also cannot guarantee that the grain number is closest to the true value. To make the grain number closest to the true value, we propose Algorithm 2 to determine the optimal single grain area $s_{opt}$. Assume that the area of the ith connected region is $s_i$ and that the number of grains in this connected region is equal to $\lfloor s_i / s_{opt} \rfloor$. The result of $s_i / s_{opt}$ contains an integer part and a decimal part. If only the integer part is regarded as the number of grains in the connected region, the error accumulates when the number of grains contained in the connected region is large. To reduce the error, the number of grains in the connected region is set to $\lfloor s_i / s_{opt} \rfloor + 1$ when the decimal part is greater than the threshold 0.4; otherwise, the number of grains is equal to $\lfloor s_i / s_{opt} \rfloor$. In this way, the total number of grains is equal to the sum of the numbers of grains contained in each connected region.
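The counting rule with the 0.4 threshold can be written in a few lines. The function names are hypothetical, and the value of `s_opt` in the example is made up for illustration; the selection of the optimal area itself (Algorithm 2) is not reproduced here.

```python
import math


def grains_in_region(area, s_opt, fraction_threshold=0.4):
    """Floor of area / s_opt, plus one if the decimal part exceeds the threshold."""
    ratio = area / s_opt
    whole = math.floor(ratio)
    return whole + 1 if ratio - whole > fraction_threshold else whole


def total_grains(areas, s_opt, fraction_threshold=0.4):
    """Sum the per-region grain counts over all connected regions."""
    return sum(grains_in_region(a, s_opt, fraction_threshold) for a in areas)


# With s_opt = 500: 520 -> 1, 980 -> 2, 1650 -> 3, 730 -> 2.
print(total_grains([520, 980, 1650, 730], s_opt=500))  # 8
```

Noise regions whose area is well below `s_opt` contribute 0 under this rule, which matches the remark that regions to the left of point A can be ignored.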

Germination detection
We need to separate the germs from the grain image to count the number of germinated grains. Since the germ is white, a pixel whose gray value is greater than 160 in the non-grain image can be considered part of a germ, where the non-grain image is the image after removing the grains. The red part in Fig 7A represents the germ connected regions. It can be seen from Fig 7A that, in addition to the normal germ parts, reflective parts of the image are also detected as germs. The area of a germ connected region, $s_{bud}$, has a certain relationship with the single grain area $s_{opt}$. When $s_{bud}$ is much smaller than $s_{opt}$, the germ is too small to be seen by the naked eye and it is not counted as a germ. Likewise, the region is not counted as a germ if $s_{bud}$ is much larger than $s_{opt}$. Hence, after numerous experiments, we judge the germ according to the following condition: [...] A germ connected region satisfying Eq (8) is regarded as a germ, and the results are shown in Fig 7B. It can be seen from Fig 7B that most germs are extracted correctly, but there is still some noise that does not belong to any germ, and some grains that have not fully germinated are predicted as germinated. Therefore, the extracted germs need to be further selected. Suppose that the length of the intersection between the germ and the grain is denoted by l and the circumference of the germ is denoted by p. There is a certain proportional relationship between l and p. As shown in Fig 8A, a normally germinated grain has a low value of l/p. As shown in Fig 8B, a grain that has not fully germinated has a larger value of l/p. Algorithm 3 gives the pseudo-code for selecting and counting the germs.
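One way to measure the ratio l/p on a pixel grid is sketched below. This is an illustrative interpretation, not the authors' Algorithm 3: the function names are hypothetical, the circumference p is approximated by the number of germ pixel edges that border a non-germ pixel, the intersection length l by the number of those edges that border a grain pixel, and the cut-off `max_ratio=0.3` is a made-up value, since the threshold used in the paper is not given in the text.

```python
def contact_ratio(germ_pixels, grain_pixels):
    """Return l / p, with both lengths counted as 4-neighbour pixel edges."""
    neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))
    perimeter = 0  # p: germ edges facing anything that is not germ
    contact = 0    # l: germ edges facing a grain pixel
    for r, c in germ_pixels:
        for dr, dc in neighbours:
            nb = (r + dr, c + dc)
            if nb not in germ_pixels:
                perimeter += 1
                if nb in grain_pixels:
                    contact += 1
    return contact / perimeter if perimeter else 0.0


def looks_germinated(germ_pixels, grain_pixels, max_ratio=0.3):
    """Accept a germ that touches a grain and has a small l / p (hypothetical cut-off)."""
    ratio = contact_ratio(germ_pixels, grain_pixels)
    return 0.0 < ratio <= max_ratio


# A thin germ attached to the grain at one end: low l / p.
thin_germ = {(0, 1), (0, 2), (0, 3), (0, 4)}
print(contact_ratio(thin_germ, {(0, 0)}))

# A single germ pixel surrounded by grain on three sides: high l / p.
print(contact_ratio({(0, 1)}, {(0, 0), (1, 1), (-1, 1)}))
```

Requiring a ratio above zero also enforces the condition that the germ must be attached to a grain, so an isolated white reflection is rejected.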

Performance of grain counting method
As shown in Fig 9, we used three rice varieties, Fuliangyou 534, II you 7954, and Luyou 911, in the experiments. For convenience, these three varieties are denoted as V1, V2, and V3, respectively. Absolute and relative errors were used as evaluation indicators in the experiments:

$$\text{Absolute error} = |A - B|, \qquad \text{Relative error} = \frac{|A - B|}{A} \times 100\%$$

where A represents the true value and B represents the predicted value measured by the proposed algorithm.
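The two evaluation indicators can be computed as in the sketch below, assuming the usual definitions |A − B| for the absolute error and |A − B| / A × 100% for the relative error. The function names are hypothetical, and the numbers in the example are made up for illustration and are not taken from the paper's tables.

```python
def absolute_error(true_value, predicted_value):
    """|A - B|, in the same unit as the inputs (percentage points for rates)."""
    return abs(true_value - predicted_value)


def relative_error(true_value, predicted_value):
    """|A - B| / A, expressed as a percentage of the true value."""
    return abs(true_value - predicted_value) / true_value * 100.0


def germination_rate(germinated, total):
    """Germinated grains as a percentage of all grains."""
    return germinated / total * 100.0


# Hypothetical image: 50 grains counted as 51; 45 germinated grains, 44 detected.
print(relative_error(50, 51))
print(absolute_error(germination_rate(45, 50), germination_rate(44, 51)))
```

The relative error suits grain counts, whose magnitude varies from image to image, while the absolute error suits germination rates, which are already percentages.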
To verify the accuracy of the proposed grain counting method, we collected 90 grain images of the above-mentioned rice varieties. These images include changes in the number of grains, scale, illumination, and rotation. Table 1 illustrates the mean relative error of grain number for the three rice varieties. As seen in Table 1, the average relative error value of grain count for the 90 images was 1.02% and the standard deviation was 1.26.
In addition, Fig 10 shows [...] We found from the experiments that the counting result may be larger than the true value if there are multiple grains with a larger area than the optimal single grain area. The true number of grains in the connected region shown in Fig 11A is 3. Since the area of each grain in the connected region is larger than the optimal single grain area, the counting result is 4. Conversely, the counting result may be smaller than the true value if there are multiple grains with a smaller area than the optimal single grain area. The area of the grain connected region in Fig 11B is small, mainly because the white color of the grains causes the grain segmentation to fail. As shown in Fig 11B, the true number of grains in the connected region is 2, but the counting result is 1.

Assessment of germination rate of grains
Table 2 illustrates the mean absolute error of germination rate for all images of the three rice varieties. As seen in Table 2, the mean absolute error of the germination rate for the 90 images was 2.7% and the standard deviation was 2.59.
In addition, Fig 12 shows the predicted and true values of germination rate for each image of rice varieties V1, V2, and V3. From Fig 12, it can be seen that the predicted values of germination rate obtained using the proposed method are close to the true values.
As shown in Fig 13A, we found from the experiments that the water droplet attached to the edge of the grain will reflect light, which makes the color of the water droplet appear white. Then the water droplet is wrongly regarded as the germ, as shown in Fig 13B.

Discussion
In this paper, each germ is treated as a connected region, and the germination rate is calculated by counting the number of germ connected regions. We found that the proposed algorithm produces incorrect results when germs touch or intertwine with each other, as shown in Fig 14. This is primarily because the germs of two different grains then appear as the same connected region. It is therefore important to avoid any contact or intertwining between the germs when arranging the grains in the container. Next, we discuss germ-to-grain contact. Fig 15 illustrates three cases of germ-to-grain contact and intertwining. In Fig 15A, a small portion of the germ is visible above the grain. The area of the grain connected region is smaller than the actual value because the germ is removed from the region, which may affect the number of grains. In Fig 15B, only a small portion of the germ is located below the grain. This has little effect on either the grain number or the germ number. Fig 15C shows that a large part of the germ lies beneath the grain, which will have a significant impact on the number of germs.
We then discuss grain-to-grain contact. In Fig 16A, the mutual touching of grains does not affect the area of the grain connected region, which indicates that the number of grains is not affected by this situation. Fig 16B illustrates that overlap between grains reduces the area of the grain connected region, thereby reducing the number of grains. Table 3 summarizes the effects of contact or overlap between grains or germs on germination rate. It can be seen from Table 3 that grain-to-grain contact and germ-to-grain contact are allowed, but all other forms of touching or overlapping are prohibited or partially allowed.
As shown in Fig 17, we selected four of the 90 images, containing 20, 40, 60, and 80 grains, respectively. All three varieties, Fuliangyou 534, II you 7954, and Luyou 911, appear in the four images. These four images also differ in their scale, illumination, and rotation. Over the 90 images, the proposed method had a mean relative error of 1.02% for grain counting and a mean absolute error of 2.7% for germination rate. The application results indicate that the performance of the proposed method is robust to these variations.

Conclusion
This paper proposes a novel algorithm for assessing the grain germination rate. The algorithm proceeds in two steps: counting the number of grains and then counting the number of germs. First, the connected regions where the grains are located are obtained by coarse segmentation followed by refinement segmentation. The optimal single grain area is then used to calculate the number of grains in each connected region. Finally, the germ number is obtained based on the characteristics of the germ distribution.
We collected 90 grain images to validate the proposed algorithm, including three varieties of Fuliangyou 534, II you 7954, and Luyou 911. These images include changes in the number of grains, scale, illumination, and rotation. The mean absolute error value of the germination rate for the 90 images was 2.7%. The experimental results show that the proposed algorithm can accurately predict the germination rate of grains.
In the future, we plan to explore the dynamics of seed germination uniformity with the proposed algorithm. Moreover, we will optimize the proposed algorithm to be independent of custom color-based thresholds so that the prediction method can be better applied to different crops and light settings.

Table 3. The effects of contact or overlap between grains or germs on germination rate, where '√' means allowed and has no effect on germination rate, '×' means prohibited and has an effect on germination rate, and '△' means partially allowed and may have an effect on germination rate.